Distant Supervised Relation Extraction with Wikipedia and Freebase

Authors

  • Marcel Ackermann
  • TU Darmstadt
  • Juan Ramón Jiménez
Abstract

In this paper we discuss a new approach to extracting relational data from unstructured text without the need for hand-labeled data. So-called distant supervision has the advantage that it scales to large amounts of web data and therefore fulfills the requirements of current information extraction tasks. In contrast to supervised machine learning, we train generic, relation- and domain-independent extractors on the basis of database entries. We use Freebase as a source of relational data and a Wikipedia corpus tagged with unsupervised word classes. In contrast to previous work in the field of distant supervision, we do not rely on preprocessing steps that involve supervised learning. This work consists of three parts: a distant supervised Named Entity Recognizer (NER); a distant supervised classifier that recognizes sentences describing a certain relation between two objects; and the combination of both, which allows us, for example, to contribute new instances to Freebase. The performance of the NER is too low for the combined method to produce usable results. Still, the subcomponents can be used independently.
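The core labeling idea behind distant supervision can be illustrated with a minimal sketch. This is not the authors' implementation: the toy triples, the sentences, the `label_sentences` helper, and the plain substring matching are all assumptions made for illustration, whereas the paper relies on Freebase entries and a distant supervised NER. A sentence that mentions both arguments of a known database entry is simply treated as a (possibly noisy) training example for that relation.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]            # (subject, relation, object)
Example = Tuple[str, str, str, str]      # (sentence, subject, object, relation)

# Hypothetical stand-in for knowledge base entries (not real Freebase data).
KB_TRIPLES: List[Triple] = [
    ("Darmstadt", "located_in", "Germany"),
    ("Marie Curie", "born_in", "Warsaw"),
]

# Hypothetical stand-in for corpus sentences.
SENTENCES: List[str] = [
    "Darmstadt is a city in the state of Hesse, Germany.",
    "Marie Curie moved from Warsaw to Paris in 1891.",
    "Paris is the capital of France.",
]


def label_sentences(sentences: List[str], triples: List[Triple]) -> List[Example]:
    """Label every sentence that mentions both arguments of a known triple.

    Matching is naive substring search, an assumption of this sketch.
    """
    labeled: List[Example] = []
    for sentence in sentences:
        for subj, rel, obj in triples:
            if subj in sentence and obj in sentence:
                labeled.append((sentence, subj, obj, rel))
    return labeled


examples = label_sentences(SENTENCES, KB_TRIPLES)
for sentence, subj, obj, rel in examples:
    print(f"{rel}({subj}, {obj}) <- {sentence}")
```

The second sentence is labeled `born_in` although it only says that Marie Curie moved away from Warsaw. This kind of label noise is inherent to the heuristic, and any classifier trained on such data has to tolerate it.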

Similar resources

Passage Retrieval for Information Extraction using Distant Supervision

In this paper, we propose a keyword-based passage retrieval algorithm for information extraction, trained by distant supervision. Our goal is to be able to extract attributes of people and organizations more quickly and accurately by first ranking all the potentially relevant passages according to their likelihood of containing the answer and then performing a traditional deeper, slower analysi...

Distant supervision for relation extraction without labeled data

Modern models of relation extraction for tasks like ACE are based on supervised learning of relations from small hand-labeled corpora. We investigate an alternative paradigm that does not require labeled corpora, avoiding the domain dependence of ACE-style algorithms, and allowing the use of corpora of any size. Our experiments use Freebase, a large semantic database of several thousand relation...

Distant Supervision for Entity Linking

Entity linking is an indispensable operation for populating knowledge repositories for information extraction. It studies how to align a textual entity mention with its corresponding disambiguated entry in a knowledge repository. In this paper, we propose a new paradigm named distantly supervised entity linking (DSEL), in the sense that the disambiguated entities that belong to a huge knowledge rep...

MSIIPL THU’s Slot-Filling Method for TAC-KBP 2015

This paper presents the design and implementation of our first English slot filling system. The slot filling task aims at extracting attribute values of the given entities. The core of the system is a set of supervised per-relation classifiers, trained by a scheme known as distant supervision. We use Freebase and Wikipedia to generate our training query-filler pairs. Annotated Gigaword received f...

Collective Cross-Document Relation Extraction Without Labelled Data

We present a novel approach to relation extraction that integrates information across documents, performs global inference and requires no labelled text. In particular, we tackle relation extraction and entity identification jointly. We use distant supervision to train a factor graph model for relation extraction based on an existing knowledge base (Freebase, derived in parts from Wikipedia). F...

Publication date: 2012